Abstract

We study probabilistic bit-probe schemes for the membership problem.
Given a set $A$ of at most $n$ elements from a universe of size $m$, we
organize a data structure so that queries of the type "$x \in A$?" can be
answered very quickly.

H. Buhrman, P.B. Miltersen, J. Radhakrishnan, and S. Venkatesh proposed
a bit-probe scheme based on expanders. Their scheme needs space of
$O(n \log m)$ bits. The scheme has a randomized algorithm processing
queries; it needs to read only one randomly chosen bit from the memory
to answer a query. For every $x$ the answer is correct with high
probability (with two-sided errors).

In this paper we show that for the same problem there exists a bit-probe
scheme with one-sided error that needs space of
$O(n \log^2 m + \mathrm{poly}(\log m))$ bits. The difference from the
model of Buhrman, Miltersen, Radhakrishnan, and Venkatesh is that we
consider a bit-probe scheme with an auxiliary word. This means that in
our scheme the memory is split into two parts of different size: the
main storage of $O(n \log^2 m)$ bits and a short word of
$\log^{O(1)} m$ bits that is pre-computed once for the stored set $A$
and "cached". To answer a query "$x \in A$?" we are allowed to read the
whole cached word and only one bit from the main storage. For some
reasonable values of parameters (e.g., for
$\mathrm{poly}(\log m) \ll n \ll m$) our space bound is better than what
can be achieved by any scheme without cached data (the lower bound
$\Omega(\frac{n^2 \log m}{\log n})$ was proven in [18]). We obtain a
slightly weaker result (space of size $n^{1+\delta}\,\mathrm{poly}(\log m)$
bits and two bit probes for every query) for a scheme that is
effectively encodable.

Our construction is based on the idea of naive derandomization, which
is of independent interest. First we prove that a random combinatorial
object (a graph) has the required properties, and then show that such a
graph can be obtained as an output of a pseudo-random bit generator.
Thus, a suitable graph can be specified by a short seed of a PRG, and we
can put an appropriate value of the seed into the cache memory of the
scheme.



*Supported in part by grants ANR EMC ANR-09-BLAN-0164-01 and NAFIT
ANR-08-EMER-008-01.

1 Introduction.



We investigate the static version of the membership problem. The aim is to
represent a set $A \subset \{1,\ldots,m\}$ by some data structure so that
queries "$x \in A$?" can be answered easily. We are interested in the case
when the number of elements in the set, $n = |A|$, is much smaller than the
size $m$ of the universe (e.g., $n = \exp\{\mathrm{poly}(\log\log m)\}$ or
$n = m^{0.1}$).

In practice, many different data structures are used to represent sets: simple 
arrays, different variants of height-balanced trees, hash tables, etc. The simplest 
solution is just an array of $m$ bits; to answer a query "$x \in A$?" we
need to read a single bit from the memory (the $x$-th bit in the data
storage is equal to 1 if and only if $x \in A$). However, the size of this
data structure is excessive: it requires $m$ bits of memory, while there
exist only $\binom{m}{n} = 2^{\Theta(n \log m)}$ different subsets $A$ of
size $n$ in the $m$-element universe.
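
The gap between the two quantities can be checked numerically. The following minimal sketch compares the size of the trivial bit array with $\log_2\binom{m}{n}$; the function name and the values of m and n are illustrative assumptions, not taken from the paper.

```python
import math

def subset_info_bits(m: int, n: int) -> float:
    """Return log2 of the number of n-element subsets of an m-element universe."""
    return math.log2(math.comb(m, n))

# Illustrative parameters (an assumption made for this example only).
m, n = 2**20, 100
lower_bound = subset_info_bits(m, n)

print(f"trivial bit array:            {m} bits")
print(f"log2 C(m, n):                 {lower_bound:.0f} bits")
print(f"n * log2(m) (upper estimate): {n * math.log2(m):.0f} bits")
```

For such parameters the information-theoretic minimum is smaller than the bit array by several orders of magnitude.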

Nowadays the standard practical solution for the membership problem is a
more complex data structure proposed by Fredman, Komlós, and Szemerédi [6].
This scheme is based on perfect hashing; a set is represented as a table of
$O(n)$ words (hash values) of size $\log m$ bits, and a query "$x \in A$?"
requires reading $O(1)$ words from the memory. The space complexity of this
construction is quite close to the trivial lower bound
$\Omega(\log\binom{m}{n})$. The asymptotics of the space complexity of this
scheme was further improved in [15, 16]. Similar space complexity was
achieved in dynamic data structures, which support fast updates of the set
stored in the database (see, e.g., the analysis of the cuckoo hashing
scheme in [26, 25]). A subtle analysis of the space and bit-probe
complexity of the membership problem was also given in [20] (in particular,
[20] suggested a membership scheme based on bounded concentrator graphs).
Note that all these schemes require reading $O(\log m)$ bits from the
memory to answer each query.

Another popular practical solution is Bloom's filter [1]. This data
structure requires only $O(n)$ bits, whatever the size of the universe is;
to answer a query we need to read $O(1)$ bits from the memory. The drawback
of this method is that we can get false answers to some queries. Only false
positive answers are possible (for some $x \notin A$ Bloom's filter answers
"yes"), but false negatives are not. When this technique is used in
practice, it is believed that for a "typical" set $A$ the fraction of false
answers should be small. However, in many applications we cannot fix a
priori any reasonable probability distribution on the family of all sets
$A$ and on the space of possible queries.
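
The one-sided behaviour of Bloom's filter is easy to see in code. Below is a minimal sketch, not a tuned implementation: the class name, the use of SHA-256 as a source of hash functions, and the parameters are assumptions made for this illustration.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: no false negatives, false positives are possible."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, x: int):
        # Derive num_hashes positions from SHA-256 (an illustrative choice).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, x: int) -> None:
        for pos in self._positions(x):
            self.bits[pos] = 1

    def __contains__(self, x: int) -> bool:
        # "yes" only if all probed bits are set; an element that was added
        # always passes, an element that was not added may pass by accident.
        return all(self.bits[pos] for pos in self._positions(x))

bf = BloomFilter(num_bits=256, num_hashes=3)
for element in (3, 17, 42):
    bf.add(element)
print(all(element in bf for element in (3, 17, 42)))  # True: no false negatives
```

Note that the error here depends on the stored set and on the query, which is exactly the issue raised in the paragraph above.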

An interesting alternative approach was suggested by Harry Buhrman, Peter
Bro Miltersen, Jaikumar Radhakrishnan, and Venkatesh Srinivasan [18]. They
introduced randomness into the query processing algorithm. That is, the
data structure remains static (it is deterministically defined for each set
$A$), but when a query is processed, we toss coins and read a randomly
chosen bit from the memory. In this model, we are allowed to return a wrong
answer with some small probability. Notice the sharp difference from
Bloom's filter: now we must correctly reply to the query "$x \in A$?" with
probability close to 1 for each $x$.

Buhrman, Miltersen, Radhakrishnan, and Venkatesh investigated both
two-sided and one-sided errors. In this paper we will concentrate mostly on
one-sided errors: if $x \in A$, then the answer must always be correct, and
if $x \notin A$, then a small probability of error is allowed.

Recall that a trivial information-theoretic bound shows that the size of
the structure representing a set $A$ cannot be less than
$\log\binom{m}{n} = \Omega(n \log m)$ bits. Surprisingly, this bound can be
achieved if we allow two-sided errors and use only a single bit probe for
each query. This result was proven in [18]. We refer to the scheme proposed
there as the BMRV-scheme:

Theorem 1 (two-sided BMRV-scheme, [18]) For any $\varepsilon > 0$ there is
a scheme for storing subsets $A$ of size at most $n$ of a universe of size
$m$ using $O(\frac{n}{\varepsilon^2} \log m)$ bits so that any membership
query "Is $x \in A$?" can be answered with error probability less than
$\varepsilon$ by a randomized algorithm which probes the memory at just one
location determined by its coin tosses and the query element $x$.

The size of the memory achieved in this theorem is only a constant factor
greater than the best possible. In fact, the trivial lower bound
$\log\binom{m}{n}$ can be improved: the smaller the error probability of
the scheme, the more memory we need.

Theorem 2 (lower bound, [18]) (a) For any $\varepsilon > 0$ and
$\frac{n}{\varepsilon} < m^{1/3}$, any $\varepsilon$-error randomized
scheme which answers queries using one bitprobe must use space
$\Omega(\frac{n}{\varepsilon \log(1/\varepsilon)} \log m)$.

(b) Any scheme with one-sided error $\varepsilon$ that answers queries
using at most one bitprobe must use
$\Omega(\frac{n^2}{\varepsilon^2 \log(n/\varepsilon)} \log m)$ bits of
storage.

Note that for one-sided error schemes the known lower bound is much
stronger. Part (b) of the theorem above implies that we cannot achieve
space of size $O(n \log m)$ with a one-probe scheme and one-sided error.
However, we can get very close if we allow $O(1)$ probes instead of a
single probe:

Theorem 3 (one-sided BMRV-scheme, [18]) Fix any $\delta > 0$. There exists
a constant $t$ such that the following holds: There is a one-sided
$\frac{1}{3}$-error randomized scheme that uses space
$O(n^{1+\delta} \log m)$ and answers membership queries with at most $t$
probes.

The constructions in [18] are not explicit: given the list of elements of
$A$, the corresponding scheme is constructed (with some brute-force search)
in time $2^{\mathrm{poly}(m)}$. Moreover, each membership query requires
computations exponential in $m$.

The crucial element of the constructions in Theorem 1 is an unbalanced
expander graph. The existence of a graph with the required parameters was
proven in [18] probabilistically. We know that such a graph exists and we
can find it by brute-force search, but we do not know how to construct it
explicitly. If we had an effective construction of an expander with good
parameters, we would get a practical variant of the BMRV-scheme. This
scheme could also be generalized to build more complex data structures (see
[23] for a construction of a dictionary data structure based on the
BMRV-scheme and some explicit expanders).

Since Bassalygo and Pinsker defined expanders [2, 3], many explicit (and
poly-time computable) constructions of expander graphs have been
discovered, see the survey [37]. However, most of the known constructions
are based on the spectral technique, which is not suitable for obtaining an
expander of degree $d$ with an expansion parameter greater than $d/2$, see
[13]. This is not enough for the construction used in the proof of
Theorem 1 in [18]; there we need a graph with expansion parameter close to
$d$.

There are only very few effective constructions of unbalanced graphs with
large expansion parameter. One of the known constructions was suggested by
Capalbo et al. in [21]; its parameters are close to the optimal values if
the size of the right part of the graph is a constant factor smaller than
the size of the left part. However, in the BMRV-scheme we need a very
unbalanced expander, i.e., a graph where the right part is much smaller
than the left part; so, the technique from [21] seems to be not suitable
here. An explicit version of the BMRV-scheme was suggested in [2J] (this
construction involves Trevisan's extractor; note that Trevisan's extractor
is known to be a good highly unbalanced expander, [TO]). The best known
explicit construction of a highly unbalanced expander graph was presented
in [31]. It is based on the Parvaresh-Vardy code with an efficient list
decoding. Thanks to the special structure of this expander, it enjoys a
useful property of effective decoding. Using this technique, the following
variant of Theorem 1 can be proven:

Theorem 4 ([31]) For any $\delta > 0$ there exists a scheme for storing
subsets $A$ of size at most $n$ of a universe of size $m$ using
$n^{1+\delta} \cdot \mathrm{poly}(\log m)$ bits so that any membership
query can be answered with error probability less than $\varepsilon$ by a
randomized algorithm which probes the memory at one location determined by
its coin tosses and the query element $x$.

Given the list of elements of $A$, the corresponding storing scheme can be
constructed in time $\mathrm{poly}(\log m, n)$. When the storing scheme is
constructed, a query for an element $x$ can be calculated in time
$\mathrm{poly}(\log m)$.

In Theorems 1, 3, and 4 a set $A$ is encoded into a bit string, and when we
want to know if $x \in A$, we just read from this string one randomly
chosen bit (or $O(1)$ bits in Theorem 3). The obtained information is
enough to decide whether $x$ is an element of the set. Let us notice that
in all these computations we implicitly use more information than just a
single bit extracted from the memory. To make a query to the scheme and to
process the retrieved bit, we need to know the parameters of the scheme:
the size $n$ of the set $A$, the size $m$ of the universe, and the allowed
error probability $\varepsilon$. This auxiliary information is very short
(it takes only $\log(m/\varepsilon)$ bits), and it does not depend on the
stored set $A$. We assume that this information is somehow hardwired into
the bitprobe scheme (we shall say that this information is cached in
advance by the algorithm that processes queries).

In this paper we consider a more liberal model, where some small
information cached by the scheme can depend not only on $n$, $m$, and
$\varepsilon$, but also on the set $A$. Technically, the data stored in our
scheme consists of two parts of different size: a small cached string $C$
of length $\mathrm{poly}(\log m)$, and a long bit string $B$ of length
$n \cdot \mathrm{polylog}(m)$. Both these strings are prepared for a given
set $A$ of $n$ elements (in the universe of size $m$). When we need to
answer a query "$x \in A$?", we use $C$ to compute probabilistically a
position in $B$ and read one bit there. This is enough to answer whether
$x$ is an element of $A$, with a small one-sided error:

Theorem 5 Fix any constant $\varepsilon > 0$. There exists a one-sided
$\varepsilon$-error randomized scheme that includes a string $B$ of length
$O(n \log^2 m)$ and an auxiliary word $C$ of length
$\mathrm{poly}(\log m)$. We can answer membership queries "$x \in A$?" with
one bit probe to $B$. For $x \in A$ the answer is always correct; for each
$x \notin A$ the probability of error is less than $\varepsilon$.

The position of the bit probed in $B$ is computed from $x$ and the
auxiliary word $C$ in time $\mathrm{poly}(\log m)$.

Remark 1: Schemes with 'cached' auxiliary information that depends on $A$
(not only on its size $n = |A|$ and the size $m$ of the universe) make
sense only if the cached information is very small. Indeed, if the size of
the cached data is about $\log\binom{m}{n}$ bits, then we can put there the
list of all elements of $A$, so the problem becomes trivial. Since in our
construction we need cached information of size $\mathrm{poly}(\log m)$
bits, the result is interesting when $\mathrm{poly}(\log m) \ll n \ll m$,
e.g., for $n = \exp\{\mathrm{poly}(\log\log m)\}$. Note that by Theorem 2
the space size $O(n \log^2 m)$ with one-sided error cannot be achieved by
any scheme without cached auxiliary information that depends on $A$.
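
To see when the new bound wins, one can compare it with the lower bound for one-probe, one-sided schemes without cache. The following back-of-the-envelope computation treats $\varepsilon$ as a constant and assumes that part (b) of Theorem 2 then specializes to $\Omega(n^2 \log m / \log n)$; logarithms are binary and $n \le m$.

```latex
\[
  \frac{n^2 \log m / \log n}{n \log^2 m}
  \;=\; \frac{n}{\log n \cdot \log m}
  \;\ge\; \frac{n}{\log^2 m}
  \qquad (\text{since } \log n \le \log m),
\]
\[
  \text{so for } n \ge \log^3 m \text{ the ratio is at least } \log m \to \infty .
\]
```

Under these assumptions, already for $n \ge \log^3 m$ the space $O(n \log^2 m)$ is asymptotically smaller than what any one-probe scheme with one-sided error can achieve without cached data.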

Remark 2: The model of data structures with cached memory looks useful for 
practical applications. Indeed, most computer systems contain some hierarchy 
of memory levels: CPU registers and several levels of processor caches, then
random access memory, flash memory, magnetic disks, remote network-accessible
drives, etc. Each next level of memory is cheaper but slower. So, it is inter- 
esting to investigate the tradeoff between expensive and fast local memory and 
cheap and slow external memory. There is a rich literature on algorithms with 
external memory, see, e.g, surveys (TTl [28] . Thus, the idea of splitting the data 
structure into 'cached' and 'remote' parts is very natural and quite common 
in computer science. However, the tradeoff between local and external
memory is typically studied for dynamic data structures. At the same time,
it is not obvious that fast cache memory of negligible size can help to
process queries in a static data structure. Since a small cache 'knows'
virtually nothing about most objects in the database, at first sight it
seems to be useless. However, Theorem 5 shows that even a very small cache
can be surprisingly efficient.

Remark 3: In the proof of Theorem 5 we derandomize a probabilistic proof
of existence of some kind of expander graphs. In many papers
derandomization of probabilistic arguments involves highly sophisticated
ad-hoc techniques. But we do derandomization in a rather naive and
straightforward way: take a value of a suitable pseudo-random bit generator
and check that with high probability (i.e., for most values of the seed)
the pseudo-random object enjoys the required property. In fact, we observe
that several types of generators fit our construction. Since the required
property of a graph can be tested in $\mathrm{AC}^0$, we can use the
classic Nisan-Wigderson generator or (thanks to the recent result of
Braverman [29]) any polylog-independent function. Also, the required
property of a pseudo-random graph can be tested by a machine with
logarithmic space. Hence, we can use Nisan's generator [10]. Our idea of
employing pseudo-random structures is quite similar to the construction of
pseudo-random hash functions in [30]. We stress that we do not need any
unproven assumptions to construct all these generators.

In Theorem 5 we construct a scheme such that decoding is effective: when
the scheme is prepared, we can answer queries "$x \in A$?" in time
polynomial in $\log m$. However, the encoding (preparing the database and
the auxiliary word for a given set $A$) runs in expected time
$\mathrm{poly}(m)$. We assume that $n \ll m$, so time polynomial in $m$
seems to be too long. It is natural to require that the encoding of the
scheme runs in time $\mathrm{poly}(n, \log m)$ (i.e., polynomial in the
size of the encoded set $A$ and the size of an index of each element in the
universe). The next theorem claims that the encoding time can be reduced if
we slightly increase the space of the scheme:

Theorem 6 The scheme from Theorem 5 can be made effectively encodable in
the following sense. Fix any constants $\varepsilon, \delta > 0$. There
exists a randomized scheme that includes a bit string $B$ of length
$n^{1+\delta}\,\mathrm{poly}(\log m)$ and an auxiliary word $C$ of length
$\mathrm{poly}(\log m)$. We can answer membership queries "$x \in A$?" with
two bit probes to $B$. For $x \in A$ the answer is always correct; for
$x \notin A$ the probability of error is less than $\varepsilon$.

The positions of the bits probed in $B$ are computed from $x$ and the
auxiliary word $C$ in time $\mathrm{polylog}(m)$. Given $A$, the entire
scheme (the string $B$ and the word $C$) can be computed probabilistically
in average time $\mathrm{poly}(n, \log m)$.

The rest of the paper is organized as follows. In Section 2 we recall the
main ideas of the BMRV-scheme. We prove Theorem 5 in Section 3, and
Theorem 6 in Section 4. In Section 5 we show that the proof of
Theorem 2 (a) mutatis mutandis can be applied to our model with small
cached memory. In the Conclusion we discuss some open questions.

2 How the BMRV-scheme works.

Let us explain the main ideas of the proof of Theorem 1 in [18]. The
construction is based on highly unbalanced expanders.

Definition 1 A bipartite graph $G = (L, R, E)$ (with left part $L$, right
part $R$ and set of edges $E$) is called an $(m, s, d, k, \delta)$-expander
if $L$ consists of $m$ vertices, $R$ consists of $s$ vertices, the degree
of each vertex in $L$ is equal to $d$, and for each subset of vertices
$A \subset L$ of size at most $k$ the number of neighbors is at least
$(1-\delta)d|A|$.

We use the standard notation: for a vertex $v$ we denote by $\Gamma(v)$ the
set of its neighbors; for a set of vertices $A$ we denote by $\Gamma(A)$
the set of neighbors of $A$, i.e., $\Gamma(A) = \bigcup_{v \in A} \Gamma(v)$.
So, the definition of expanders claims that for all small enough sets $A$
of vertices in the left part of the graph,
$|\Gamma(A)| \ge (1-\delta)d|A|$ (the maximal size of $|\Gamma(A)|$ is
obviously $d|A|$, since the degrees of all vertices on the left are equal
to $d$). The argument below is based on the following combinatorial
property of an expander:

Lemma 1 (see [21]) Let $\varepsilon$ be a positive number, and $G$ be an
$(m, s, d, k, \delta)$-expander with $\delta \le \varepsilon/4$. Then for
every subset $A \subset L$ such that $|A| \le k/2$, the number of vertices
$x \in L \setminus A$ such that

$$|\Gamma(x) \cap \Gamma(A)| \ge \varepsilon d$$

is not greater than $|A|/2$.

Let $G$ be an $(m, s, d, k, \delta)$-expander with
$\delta \le \varepsilon/4$. The storage scheme is defined as follows. We
identify a set $A \subset \{1,\ldots,m\}$ of size $n$ ($n \le k/2$) with a
subset of vertices in the left part of the graph. We represent it by some
labeling (by ones and zeros) of the vertices of the right part of the
graph. We do it in such a way that the vast majority (at least
$(1-\varepsilon)d$) of the neighbors of each vertex $v$ from the left part
of the graph correctly indicate whether $v \in A$. More precisely, if
$v \in A$ then at least $(1-\varepsilon)d$ of its neighbors in $R$ are
labeled by 1; if $v \in L \setminus A$ then at least $(1-\varepsilon)d$ of
its neighbors in $R$ are labeled by 0. Thus, querying a random neighbor of
$v$ returns the right answer with probability at least $1-\varepsilon$.

It remains to explain why such a labeling exists. In fact, it can be
constructed by a simple greedy algorithm. First, we label all neighbors of
$A$ by 1, and the other vertices on the right by 0. This labeling
classifies correctly all vertices in $A$. But it can misclassify some
vertices outside $A$: some vertices in $L \setminus A$ can have too many
(more than $\varepsilon d$) neighbors labeled by 1. Denote by $B$ the set
of all these "erroneous" vertices. We relabel all their neighbors, i.e.,
all vertices in $\Gamma(B)$, to 0. This fixes the problem with vertices
outside $A$, but it can create problems with some vertices in $A$. We take
the set of all vertices in $A$ that became erroneous (i.e., vertices in $A$
that have at least $\varepsilon d$ neighbors in $\Gamma(B)$), and denote
this set of vertices by $A'$. Then, we relabel all of $\Gamma(A')$ to 1.
This operation creates new problems in some set of vertices
$B' \subset B$; we relabel $\Gamma(B')$ to 0, etc. In this iterative
procedure we get a sequence of sets

$$A \supset A' \supset A'' \supset \ldots$$

whose neighbors are relabeled to 1 on steps 1, 3, 5, ... of the algorithm,
and

$$B \supset B' \supset B'' \supset \ldots$$

whose neighbors are relabeled to 0 on iterations 2, 4, 6, ...,
respectively. Lemma 1 guarantees that the number of erroneous vertices is
reduced by a factor of 2 on each iteration ($|B| \le |A|/2$,
$|A'| \le |B|/2$, etc.). Hence, the procedure terminates in $\log m$ steps.
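
The greedy procedure is short enough to be sketched in code. The following Python sketch is an illustration under stated assumptions, not the implementation from [18]: the graph is given as a hypothetical adjacency list `neighbors[v]` (the $d$ right-hand neighbors of the left vertex `v`, repetitions allowed), vertices are numbered from 0, and a random graph with generously chosen illustrative parameters is used instead of a verified expander, so termination is not guaranteed in general (hence the `max_rounds` guard).

```python
import random

def greedy_labeling(neighbors, s, A, eps, max_rounds=64):
    """Label the s right vertices by 0/1 so that, for every left vertex v,
    at most eps*d of its d neighbors carry the wrong label."""
    m, d = len(neighbors), len(neighbors[0])
    A = set(A)
    outside = set(range(m)) - A
    label = [0] * s

    def erroneous(v, wrong):
        # v is erroneous if more than eps*d of its neighbors have label `wrong`.
        return sum(label[w] == wrong for w in neighbors[v]) > eps * d

    current, value = A, 1  # step 1: relabel all neighbors of A to 1
    for _ in range(max_rounds):
        if not current:
            return label
        for v in current:
            for w in neighbors[v]:
                label[w] = value
        # Relabeling to `value` can only hurt vertices on the opposite side.
        opposite = outside if value == 1 else A
        current = {v for v in opposite if erroneous(v, wrong=value)}
        value = 1 - value
    raise RuntimeError("no valid labeling found; the graph may not be an expander")

def query(neighbors, label, v, rng=random):
    """One-probe query: read the label of a random neighbor of v."""
    return label[rng.choice(neighbors[v])]

# Illustrative parameters (assumptions made for this example only).
rng = random.Random(0)
m, n, d, s, eps = 200, 5, 20, 20000, 0.25
neighbors = [[rng.randrange(s) for _ in range(d)] for _ in range(m)]
A = set(rng.sample(range(m), n))
label = greedy_labeling(neighbors, s, A, eps)
```

A call such as `query(neighbors, label, v)` then errs with probability at most `eps` for every `v`, because at most `eps * d` of the neighbors of `v` carry the wrong label. For simplicity the sketch rechecks the whole opposite side on every round instead of tracking the nested sets $A', B', \ldots$ explicitly.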
To organize a storing scheme (and to estimate its size) for a set $A$ of
size $n$ in the universe of size $m$, we should construct an
$(m, s, d, k = 2n, \delta = \varepsilon/4)$-expander. The parameters $m$,
$k$, $\delta$ of the graph are determined directly by the parameters of the
desired scheme (by the size of $A$ and the universe and the allowed error
probability $\varepsilon$). We want to minimize the size $s$ of the right
part of the graph, which is the size of the stored data. The existence of
expanders with good parameters can be proven by probabilistic arguments:

Lemma 2 ([18]) For all integers $m, n$ and real $\varepsilon > 0$ there
exists an $(m, s = O(\frac{n \log m}{\varepsilon^2}),
d = \frac{\log m}{\varepsilon}, n, \varepsilon)$-expander. Moreover, the
vast majority of bipartite graphs with $m$ vertices on the left,
$s = O(\frac{n \log m}{\varepsilon^2})$ vertices on the right, and degree
$d = \frac{\log m}{\varepsilon}$ at all vertices on the left are such
expanders.

Given the parameters $m$, $n$, $\varepsilon$, we can find an
$(m, O(\frac{n \log m}{\varepsilon^2}), \frac{\log m}{\varepsilon}, n,
\varepsilon)$-expander by brute-force search. This can be done by a
deterministic algorithm in time $2^{\mathrm{poly}(m/\varepsilon)}$. Hence,
we can construct the bit-probe structure defined above in exponential time.
Moreover, when the structure is constructed and we want to answer a query
"$x \in A$?", we need to read only one bit from the stored bit string. But
to select the position of this bit we need to reconstruct the expander
graph again, which requires exponential computations. We could keep the
structure of the computed graph in "cache" (compute the graph once, and
then re-use it every time a new query should be answered). However, in this
case the size of the "cached data" (the size of the graph) becomes much
greater than $m$, which makes the bit-probe scheme meaningless (it is
cheaper to store $A$ as a trivial $m$-bit array).

In [31] a nice and very powerful explicit construction of expanders was
suggested:

Theorem 7 ([31]) Fix an $\varepsilon > 0$ and $\delta > 0$. For all
integers $m, n$ there exists an explicit
$(m, s = n^{1+\delta} \cdot \mathrm{poly}(\log m),
d = \mathrm{poly}(\log m), n, \varepsilon)$-expander such that for an index
of a vertex $v$ from the left part (a binary representation of an integer
between 1 and $m$) and an index of an outgoing edge (a binary
representation of an integer between 1 and $d$), the corresponding neighbor
in the right part of the graph (an integer between 1 and $s$) can be
computed in time polynomial in $\log m$.

Also, the following effective decoding algorithm exists. Given a set of
vertices $T$ from the right part of the graph, we can compute the list of
vertices in the left part of the graph that have at least $4\varepsilon d$
neighbors in $T$, i.e.,

$$S = \{v : |\Gamma(v) \cap T| \ge 4\varepsilon d\},$$

in time $\mathrm{poly}(|S|, n, \log m)$.

Theorem 4 is proven by plugging the expander from Theorem 7 into the
general scheme explained above, see details in [31].

3 Proof of Theorem 5



3.1 Refinement of the property of $\varepsilon$-reduction.

The construction of a bit-probe scheme for a set $A$ of size $n$ in the
$m$-element universe (with the probability of an error bounded by some
$\varepsilon$) explained in the previous section involves an
$(m, s, d, k, \delta)$-expander with $s = O(\frac{n}{\delta^2} \log m)$ and
$d = O(\frac{1}{\delta} \log m)$. Such a graph contains $dm$ edges (the
degree of each vertex on the left is $d$). The list of all its edges can be
specified by a string of $dm \log s$ bits: we sort all edges by their left
ends, and specify for each edge its right end. Denote the size of the
description of this graph by $N = dm \log s$.
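
For concreteness, the encoding just described can be read back as follows. This is a sketch with hypothetical names (`neighbor_from_bits`, 0-based indices), assuming that $s$ is a power of two so that every right end occupies exactly $\log s$ bits.

```python
def neighbor_from_bits(bits: str, v: int, i: int, d: int, log_s: int) -> int:
    """Return the i-th neighbor (0-based) of the left vertex v (0-based).

    `bits` is the description of the graph: for each left vertex in order,
    the right ends of its d edges, each written with log_s binary digits.
    """
    start = (v * d + i) * log_s
    return int(bits[start:start + log_s], 2)

# Toy graph: m = 3 left vertices, degree d = 2, s = 4 right vertices (log s = 2).
m, d, log_s = 3, 2, 2
bits = "00" "01" "10" "11" "01" "00"  # N = d * m * log s = 12 bits
assert len(bits) == d * m * log_s

graph = [[neighbor_from_bits(bits, v, i, d, log_s) for i in range(d)]
         for v in range(m)]
print(graph)  # [[0, 1], [2, 3], [1, 0]]
```

The point of the layout is that the neighbor needed for a single probe is found by reading $\log s$ consecutive bits at a position computed from $v$ and $i$, without decoding the rest of the string.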

In what follows we will assume that the number $s$ is a power of 2 (this
increases the parameters of the graph by a factor of at most 2). So, we may
assume that every string of $N(m, s, d) = dm \log s$ bits specifies a
bipartite graph with $m$ vertices on the left, $s$ vertices on the right
and degree $d$ on the left. Lemma 2 claims that most of these bit strings
of length $N$ describe an $(m, s, d, k, \delta)$-expander. By Lemma 1, if a
graph is an expander with these parameters, then for
$\varepsilon = 4\delta$ and for every set $A \subset L$ of size at most
$k/2$ the following reduction property holds:

Combinatorial Property 1 ($\varepsilon$-reduction property) For every
subset $A \subset L$ such that $|A| \le k/2$, the number of vertices
$x \in L \setminus A$ such that

$$|\Gamma(x) \cap \Gamma(A)| \ge \varepsilon d$$

is not greater than $|A|/2$.

This property was the main ingredient of the BMRV-scheme. In our bit-probe
scheme we will need another variant of Property 1:

Combinatorial Property 2 (strong $\varepsilon$-reduction) Let
$G = (L, R, E)$ be a bipartite graph, and $A \subset L$ be a subset of
vertices from the left part. We say that the strong $\varepsilon$-reduction
property holds for $A$ in this graph if for all $x \in L \setminus A$

$$|\Gamma(x) \cap \Gamma(A)| < \varepsilon d.$$

Lemma 3 Fix an $\varepsilon > 0$. For all integers $m, n$, for every
$A \subset \{1,\ldots,m\}$ of size $n$ there exists a bipartite graph
$G = (L, R, E)$ such that

• $|L| = m$ (the size of the left part);

• $|R| = 2d^2 n = O(n \log^2 m)$ (the size of the right part);

• the degree of each vertex in the left part is
$d = \frac{2 \log m}{\varepsilon} = O(\log m)$;

• the property of strong $\varepsilon$-reduction holds for the set $A$ (we
identify it with a subset of vertices in the left part of the graph).

Moreover, the property of strong $\varepsilon$-reduction for $A$ holds for
the majority of graphs with the parameters specified above.

The order of quantifiers is important here: we do not claim that in a
random graph the strong $\varepsilon$-reduction property holds for all $A$;
we say only that for every $A$ the strong $\varepsilon$-reduction is true
in a random graph.

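Proof of lemma: Let $v$ be any vertex in $L \setminus A$. We estimate the
probability that at least $\varepsilon d$ neighbors of $v$ are at the same
time neighbors of $A$ (assuming that all edges are chosen at random
independently). There are $\binom{d}{\varepsilon d}$ choices of
$\varepsilon d$ vertices among all neighbors of $v$, and each neighbor of
$v$ falls into $\Gamma(A)$ with probability at most
$|\Gamma(A)|/|R| \le dn/|R|$. Hence,

$$\mathrm{Prob}\big[|\Gamma(v) \cap \Gamma(A)| \ge \varepsilon d\big]
\le \binom{d}{\varepsilon d} \left(\frac{dn}{|R|}\right)^{\varepsilon d}
\le \left(\frac{\mathrm{e}}{\varepsilon}\right)^{\varepsilon d}
\left(\frac{1}{2d}\right)^{\varepsilon d}
= \left(\frac{\mathrm{e}}{2\varepsilon d}\right)^{\varepsilon d},$$

where $\mathrm{e} = 2.718\ldots$ is the base of the natural logarithm. For
$d = \frac{2 \log m}{\varepsilon}$ this probability is less than $1/m^2$
(for each vertex $v$). So, the expected number of vertices
$v \in L \setminus A$ such that
$|\Gamma(v) \cap \Gamma(A)| \ge \varepsilon d$ is less than $1/m < 1/2$.
Hence, the strong $\varepsilon$-reduction property holds for $A$ for more
than 50% of graphs.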
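
A quick experiment illustrates the lemma. The sketch below samples one random left-regular graph and tests the strong $\varepsilon$-reduction property for a fixed set $A$. It then answers queries with the labeling that marks exactly the vertices of $\Gamma(A)$ by 1, which is one natural way to exploit the property (an assumption of this illustration). The function names and the concrete values of m, n, d and ε are illustrative as well; only the relation $|R| = 2d^2 n$ is taken from the lemma.

```python
import random

def strong_reduction_holds(neighbors, A, eps):
    """Check that every v outside A has fewer than eps*d neighbors in Gamma(A)."""
    d = len(neighbors[0])
    A = set(A)
    gamma_A = {w for u in A for w in neighbors[u]}
    return all(
        sum(w in gamma_A for w in neighbors[v]) < eps * d
        for v in range(len(neighbors)) if v not in A
    )

def one_sided_query(neighbors, gamma_A, v, rng):
    """Probe one random neighbor of v; answer 'yes' iff it lies in Gamma(A)."""
    return rng.choice(neighbors[v]) in gamma_A

# Illustrative parameters (assumptions made for this example only).
rng = random.Random(1)
m, n, d, eps = 256, 4, 32, 0.5
s = 2 * d * d * n  # size of the right part, as in the lemma
neighbors = [[rng.randrange(s) for _ in range(d)] for _ in range(m)]
A = set(rng.sample(range(m), n))
gamma_A = {w for u in A for w in neighbors[u]}

print(strong_reduction_holds(neighbors, A, eps))
```

With this labeling every neighbor of a vertex of $A$ is marked, so queries for $x \in A$ are never answered incorrectly; if the strong reduction property holds, a query for $x \notin A$ errs with probability less than `eps`.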

3.2 Testing the property of strong $\varepsilon$-reduction.

Lemma 3 implies that a graph with the strong reduction property for $A$
exists. Given $A$, we can find such a graph by brute-force search. But we
cannot use such a graph in our bit-probe scheme even if we do not care
about computational complexity: the choice of the graph depends on $A$, and
the size of the graph is too large to embed it into the scheme explicitly.
We need to find a suitable graph with a short description. We will do it
using pseudo-random bit generators ('pseudo-random' graphs will be
parameterized by the seed of a generator).

Property 2 is a property of a graph and of a set of vertices $A$ in this
graph. We can interpret it as a property of an $N$-bit string (that
determines a graph) and some $A \subset \{1,\ldots,m\}$. Lemma 3 claims
that for every $A$, for a randomly chosen graph (a randomly chosen $N$-bit
string), the strong reduction property is true with high probability. We
want to show that the same is true for a pseudo-random graph. First, we
observe that the strong reduction property can be tested by an
$\mathrm{AC}^0$ circuit (a Boolean circuit of bounded depth, with a
polynomial number of AND and OR gates with unbounded fan-in, and
negations).

Indeed, we need to check for each vertex $v \in L \setminus A$ that the
number of vertices in $\Gamma(v) \cap \Gamma(A)$ is not large. For each
vertex $w$ in the right part of the graph we can compute by an
$\mathrm{AC}^0$-circuit whether $w \in \Gamma(A)$:

$$\bigvee_{u \in A} \bigvee_{i \le d}
[\text{the } i\text{-th neighbor of } u \text{ is } w]$$

(the condition in the square brackets is a statement about one edge in the
graph, i.e., it is a conjunction of $O(\log N)$ bits and negations of bits
from the representation of this graph). So, for each
$v \in L \setminus A$ and for $i = 1, \ldots, d$ we can compute whether the
$i$-th neighbor of $v$ belongs to $\Gamma(A)$:

$$b_{v,i} = \bigvee_{w \in R}
\big([\text{the } i\text{-th neighbor of } v \text{ is } w]
\wedge [w \in \Gamma(A)]\big).$$

It remains to 'count' for each $v \in L \setminus A$ the number of
neighbors in $\Gamma(A)$ and compute the thresholds

$$\mathrm{Th}(b_{v,1}, \ldots, b_{v,d}) =
\begin{cases}
1, & \text{if } b_{v,1} + \cdots + b_{v,d} \ge \varepsilon d, \\
0, & \text{otherwise.}
\end{cases}$$

In $\mathrm{AC}^0$ we cannot compute thresholds with a linear number of
inputs (e.g., the majority function is not in $\mathrm{AC}^0$, see [5]).
However, we need threshold functions with only a logarithmic number of
inputs. Such a function can be represented by a CNF of size
$2^{O(d)} = \mathrm{poly}(N)$.

Then, we combine these thresholds for all $v \in L \setminus A$, and get an
$\mathrm{AC}^0$-circuit that tests the property of strong
$\varepsilon$-reduction.
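
The size bound for thresholds can be made concrete. A threshold function "at least $t$ of $d$ inputs are equal to 1" has a simple monotone CNF: there are at least $t$ ones if and only if every set of $d - t + 1$ inputs contains a one. The sketch below (function names are hypothetical) builds this CNF and checks it by brute force; it has $\binom{d}{d-t+1} \le 2^d$ clauses.

```python
from itertools import combinations, product
from math import comb

def threshold_cnf(d: int, t: int):
    """Clauses of a monotone CNF for [x_0 + ... + x_{d-1} >= t], 1 <= t <= d.

    Each clause is a tuple of variable indices and stands for their OR.
    """
    return list(combinations(range(d), d - t + 1))

def eval_cnf(clauses, x):
    """Evaluate the AND of ORs on the 0/1 assignment x."""
    return all(any(x[i] for i in clause) for clause in clauses)

d, t = 6, 3
clauses = threshold_cnf(d, t)
assert len(clauses) == comb(d, d - t + 1) <= 2**d
assert all(eval_cnf(clauses, x) == (sum(x) >= t)
           for x in product((0, 1), repeat=d))
```

Since $d$ is logarithmic in $N$, a formula with at most $2^d$ clauses has size polynomial in $N$.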

3.3 Pseudo-random graphs. 

We need to generate a pseudo-random string of $N$ bits that satisfies the
strong $\varepsilon$-reduction property (for some fixed set $A$). We know
that (i) by Lemma 3, for a uniformly distributed random string this
property is true with high probability, and (ii) this property can be
checked in $\mathrm{AC}^0$. It remains to choose a pseudo-random bit
generator that fools this particular $\mathrm{AC}^0$-circuit. There exist
several generators that fool such distinguishers. Below we mention three
different solutions.

Remark: We can test Property 2 by an $\mathrm{AC}^0$-circuit for every
fixed set $A$, but not for all sets $A$ together.

The first solution: the generator of Nisan and Wigderson. The classic way
to fool an $\mathrm{AC}^0$ circuit is the Nisan-Wigderson generator:

Theorem 8 (Nisan-Wigderson generator, [12]) For every constant $c$ there
exists an explicit family of functions

$$G_m : \{0,1\}^{\mathrm{poly}(\log N)} \to \{0,1\}^N$$

such that for any family of circuits $C_N$ (with $N$ inputs) of size
polynomial in $N$ and depth $c$, the difference

$$\big|\mathrm{Prob}_y[C_N(y) = 1] - \mathrm{Prob}_z[C_N(G_m(z)) = 1]\big|$$

tends to zero (faster than $1/\mathrm{poly}(N)$).

The generator is effective: each bit of the generator's value $G_m(x)$ can
be computed from a given $x$ in time $\mathrm{poly}(\log N)$.

From this theorem and Lemma 3 it follows that for each
$A \subset \{1,\ldots,m\}$ of size at most $n$, for most values of the seed
of the Nisan-Wigderson generator $G_m$, the pseudo-random graph $G_m(x)$
satisfies the strong $\varepsilon$-reduction property for $A$.

The second solution: polylog-independent strings. M. Braverman proved that
all polylog-independent functions fool $\mathrm{AC}^0$-circuits:

Theorem 9 ([29]) Let $C$ be a Boolean circuit of depth $r$ and size $M$,
$\varepsilon$ be a positive number, and

$$D = \left(\log \frac{M}{\varepsilon}\right)^{k r^2}$$

(for some absolute constant $k$). Then $C$ cannot distinguish between the
uniform distribution $U$ and any $D$-independent distribution $\mu$ on its
inputs:

$$\big|\mathrm{Prob}_\mu[C(x) \text{ accepts a } \mu\text{-random } x]
- \mathrm{Prob}_U[C(x) \text{ accepts a } U\text{-random } x]\big|
< \varepsilon.$$

It follows that instead of the Nisan-Wigderson generator we can take any
$(\log^c n)$-independent function (for a large enough constant $c$). The standard way
to generate $r$-independent bits (in our case we need $r = \log^c n$) is a polynomial
of degree $r$ over a finite field of size about $N$. The seeds of this 'pseudo-random
bits generator' are the coefficients of the polynomial. Other (computationally
more efficient) constructions of polylog-independent functions can also be used.
E.g., the construction from [9, 14] provides a family of $(\log^c n)$-independent
functions with a very fast evaluation algorithm, and each function is specified by
$\mathrm{poly}(\log n)$ bits (so the size of the seed is again polylogarithmic).
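
The polynomial construction can be illustrated by a minimal sketch. It makes several assumptions that are not in the text: it works over a prime field $\mathbb{Z}_p$ instead of a field of size about $N$, it uses $r$ coefficients (a polynomial of degree at most $r-1$, which already gives $r$-wise independence), and the names `poly_eval`, `random_seed` and `is_r_wise_uniform` are hypothetical. The last function checks the independence exhaustively and is feasible only for tiny parameters.

```python
import random
from collections import Counter
from itertools import product


def poly_eval(coeffs, x, p):
    """Evaluate sum(coeffs[j] * x**j) modulo the prime p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def random_seed(r, p, rng=random):
    """The seed of the generator: r uniformly random coefficients modulo p."""
    return [rng.randrange(p) for _ in range(r)]


def is_r_wise_uniform(r, p, points):
    """Exhaustive check (tiny p only): over all p**r seeds, the tuple of
    values at the given points takes every possible value equally often."""
    counts = Counter(
        tuple(poly_eval(seed, x, p) for x in points)
        for seed in product(range(p), repeat=r)
    )
    return len(counts) == p ** len(points) and len(set(counts.values())) == 1
```

For example, `is_r_wise_uniform(3, 5, (0, 2, 4))` holds, while three points are no longer jointly uniform when only two coefficients are used. Note that values in $\mathbb{Z}_p$ are not bits; extracting unbiased bits would require a field of size $2^k$, which this sketch does not implement.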

The third solution: the generator of Nisan. The property of strong $\varepsilon$-reduction
can be tested by a Turing machine with logarithmic working space. Technically
we need a machine with

• advice tape: read-only, two-way tape, where the list of elements of $A$ is
written;

• input tape: read-only tape with random (or pseudo-random) bits, with a
logarithmic number of passes (the machine is allowed to pass along the
input on this tape only $O(\log N)$ times);

• index tape: read-only, two-way tape with logarithmic additional information;

• work tape that is two-way and read-write; the zone of the work tape is
restricted to $O(\log N)$.

We interpret the content of the input tape as a list of edges of a random (or
pseudo-random) graph $Q = (L, R)$. The content of the index tape is understood
as an index of a vertex $v \in L \setminus A$. The machine reads the bits from the input
tape (understood as a list of edges of a random graph) and checks that the vast
majority of neighbors of $v$ do not belong to the set of neighbors of $A$. The
machine needs to read the input tape $2d = O(\log N)$ times (where $d$ is the degree of
$v$): on the first pass we find the index of the first neighbor of $v$; on the second
pass we check whether this neighbor of $v$ is incident to any vertex of $A$; then
we find the second neighbor of $v$, check whether it is incident to any vertex
of $A$, etc. The machine accepts the input if $|\Gamma(v) \cap \Gamma(A)| \le \varepsilon d$.

We can use Nisan's generator [10] to fool this machine. Indeed, this checking
procedure fits the general framework of [22], where Nisan's generator was used
to derandomize several combinatorial constructions. The only nonconventional
feature in our argument is that the input tape is not read-once: we allow the
machine to read the tape with random bits a logarithmic number of times¹. But we
can apply Nisan's technique to a machine that reads random bits several times.
David, Papakonstantinou, and Sidiropoulos observed (see [33]) that a log-space
machine with a logarithmic (and even polylogarithmic) number of passes on the
input tape is fooled by Nisan's generator with a seed of size $\mathrm{poly}(\log N)$.
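
The multi-pass checking procedure described above can be sketched as follows. This is only an illustration under assumptions that are not in the text: the input tape is modelled by a hypothetical class `EdgeTape` whose every call starts a new pass over the list of edges, the advice tape is modelled by a Python set (so a membership test stands in for scanning the advice tape), and the state kept between passes is a few counters and vertex indices.

```python
class EdgeTape:
    """Read-only list of edges (left, right); each call starts a new pass."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.passes = 0

    def __call__(self):
        self.passes += 1
        return iter(self.edges)


def ith_neighbor(read_edges, v, i):
    """One pass: return the i-th (0-based) neighbor of the left vertex v."""
    seen = 0
    for left, right in read_edges():
        if left == v:
            if seen == i:
                return right
            seen += 1
    raise ValueError("v has fewer than i + 1 neighbors")


def is_neighbor_of_A(read_edges, A, w):
    """One pass: is the right vertex w incident to some vertex of A?"""
    return any(right == w and left in A for left, right in read_edges())


def accepts(read_edges, A, v, d, eps):
    """Accept iff at most eps * d neighbors of v are also neighbors of A.
    Uses 2 * d passes over the edge list and a constant number of counters."""
    bad = 0
    for i in range(d):
        w = ith_neighbor(read_edges, v, i)      # pass 2i + 1
        if is_neighbor_of_A(read_edges, A, w):  # pass 2i + 2
            bad += 1
    return bad <= eps * d
```

After one call of `accepts` the counter `passes` of the tape equals $2d$, which matches the number of passes stated in the text.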

Now we are ready to prove Theorem 5. We fix an $\varepsilon > 0$ and a set
$A \subset \{1, \dots, m\}$ of size $n$. Let $G_m$ be one of the pseudo-random bits generators
discussed above. For all these generators, for most values of the seed $z$ the
value $G_m(z)$ encodes a graph such that the strong $\varepsilon$-reduction property holds
for $A$. Let us fix one such seed. We label by 1 all vertices in $\Gamma(A)$ and by 0
all other vertices in $R$ in the graph encoded by the string $G_m(z)$.

The seed value $z$ is the "auxiliary word" $C$, and the labeling of the right
part of the graph specified above is the bit string $B$. To answer a
query "$x \in A$?" we take a random neighbor of $x$ in the graph and check its
label. If the label is 1, we answer "$x \in A$"; otherwise we answer "$x \notin A$".

If $x \in A$, then there are no errors, since all neighbors of $A$ are labeled by
1. If $x \notin A$, then the probability of an error is bounded by $\varepsilon$ because of the strong
$\varepsilon$-reduction property. We can answer a query in time $\mathrm{poly}(\log m)$ since the
generators under consideration are effectively computable.
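
The scheme just described can be sketched in a few lines. The sketch rests on assumptions that are not part of the construction: a SHA-256 hash of (seed, vertex, index) is used as a stand-in for the graph encoded by $G_m(z)$, the seed is found by a brute-force search over integers, and the names `neighbor`, `encode`, `query` and the parameter `right_size` are hypothetical.

```python
import hashlib
import random


def neighbor(seed, x, i, right_size):
    """i-th neighbor of the left vertex x in the graph given by `seed`
    (a hash-based stand-in for the graph encoded by G_m(z))."""
    h = hashlib.sha256(f"{seed}:{x}:{i}".encode()).digest()
    return int.from_bytes(h[:8], "big") % right_size


def neighborhood(seed, x, d, right_size):
    return [neighbor(seed, x, i, right_size) for i in range(d)]


def encode(A, m, d, right_size, eps, max_tries=1000):
    """Try seeds until every v outside A has at most eps * d neighbors
    in Gamma(A); return the cached word (seed) and the bit string B."""
    A = set(A)
    for seed in range(max_tries):
        gamma_A = {w for a in A for w in neighborhood(seed, a, d, right_size)}
        good = all(
            sum(w in gamma_A for w in neighborhood(seed, v, d, right_size)) <= eps * d
            for v in range(m)
            if v not in A
        )
        if good:
            B = [1 if w in gamma_A else 0 for w in range(right_size)]
            return seed, B
    raise RuntimeError("no suitable seed found; try a larger right_size or d")


def query(x, seed, B, d, right_size, rng=random):
    """Read the cached seed and ONE bit of B; True means 'x in A'."""
    i = rng.randrange(d)
    return B[neighbor(seed, x, i, right_size)] == 1
```

With this encoding, `query` never errs on elements of $A$, and for every element outside $A$ at most an $\varepsilon$ fraction of the $d$ possible probes leads to the wrong answer. The search in `encode` scans the whole universe for each candidate seed, so its running time is polynomial in $m$, not in $\log m$.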

3.4 Complexity of encoding 

The disadvantage of this construction is its non-effective encoding procedure. We
know that for most seeds $z$ the corresponding graph $G_m(z)$ enjoys the strong
$\varepsilon$-reduction property. However, we need a brute-force search over all vertices
$v \in L \setminus A$ (polynomial in $m$ but not in $\log m$) to check this property for any
particular seed. Thus, we have a probabilistic encoding procedure that runs in
expected time $\mathrm{poly}(m)$: we choose random seeds until we find one suitable for
the given $A$.

In the next section we explain how to make the encoding procedure more
effective (expected time $\mathrm{poly}(n, \log m)$) at the following price: we will need
a slightly larger data storage, and we will make two bit probes instead
of one at each query.

4 Proof of Theorem 6: effective encoding.

To obtain a scheme with effective encoding and decoding we combine two
constructions: the explicit expander from [31] and a pseudo-random graph from
the previous section.

1 The same argument can be presented in a more standard framework, with a read-once 
input tape and an index tape of poly-logarithmic size. However, we believe that the argument 
becomes more intuitive when we allow many passes on the input tape. 
We fix an $n$-element set $A$ in the universe $\{1, \dots, m\}$. Now we construct
two bipartite graphs $Q_1$ and $Q_2$ that share the same left part $L = \{1, \dots, m\}$.
The first graph is the explicit
$(m, s = n^{1+\delta} \cdot \mathrm{poly}(\log m), d = \mathrm{poly}(\log m), n, \varepsilon)$-expander
$Q_1 = (L, R_1, E_1)$ from [21] with an effective decoding algorithm. We do
the first two steps of the encoding procedure of the BMRV-scheme explained
in Section 2. First we label all vertices in $\Gamma(A)$ by 1 and all other vertices by 0.
Denote the corresponding labeling (which is an $n^{1+\delta} \cdot \mathrm{poly}(\log m)$-bit string) by
$B_1$. Then we find the list of vertices outside $A$ that have too many 1-labeled
neighbors:

$$W := \{ v \in L \setminus A : |\Gamma(v) \cap \Gamma(A)| > \varepsilon d \}.$$

We do not re-label the neighbors of $W$, but we will use this set later (to find $W$
effectively, we need the effective decoding property of the graph).

Let $v \in L$ be a vertex in the left part of the graph. There are three different
cases:

• if $v$ belongs to $A$, then all neighbors of $v$ are labeled by 1;

• if $v$ does not belong to $A \cup W$, then a random neighbor of $v$ is labeled by 0
with probability $\ge 1 - \varepsilon$ (at least $(1 - \varepsilon)d$ of its $d$ neighbors are labeled by 0);

• if $v$ belongs to $W$, we cannot say anything certain about the labels of its
neighbors.

Thus, if we take a random neighbor of $v$ and see label 0 in $B_1$, then we can say
that this point does not belong to $A$. If we see label 1, then a more detailed
investigation is needed. This investigation will involve the second part of the
scheme, which we define below.

Now our goal is to distinguish between $A$ and $W$. To this end, we take a
pseudo-random graph $Q_2 = (L, R_2, E_2)$ specified by a value of a pseudo-random
bits generator $G_m(z)$ (any of the generators discussed in the previous section is
suitable). We need a version of the strong $\varepsilon$-reduction property restricted to $W$:

For every $v \in W$, at most $\varepsilon d$ vertices in $\Gamma(v)$ belong to $\Gamma(A)$.

The set $W$ is of size at most $|A|/2$ (Lemma 1), and it can be effectively computed
from $A$ (by the effective decoding property of the graph $Q_1$). Hence, for a given $z$ we
can check the property above in time $\mathrm{poly}(n, \log m)$. We know that for the
majority of seeds $z$ the graph $G_m(z)$ satisfies the strong $\varepsilon$-reduction property,
i.e., all vertices outside $A$ have at most $\varepsilon d$ neighbors in $\Gamma(A)$. Though we
cannot check this general property effectively (we cannot check it for
all vertices in the universe), we are able to check its restricted version (i.e., only
for the vertices in $W$).

Thus, in average time $\mathrm{poly}(n, \log m)$ we can probabilistically find a seed
$z$ such that the version of the strong $\varepsilon$-reduction property restricted to $W$
is true. In the corresponding graph $Q_2$ we label by 1 all vertices in $\Gamma(A)$, and
by 0 all vertices of the right part of the graph outside $\Gamma(A)$. We denote this
labeling (an $O(n \log^2 m)$-bit string) by $B_2$ and take it as the second part of the
data storage. The corresponding seed value $z$ is taken as the 'cached' memory.
The decoding procedure works as follows. Given $x \in \{1, \dots, m\}$, we take a
random neighbor of $x$ in each of the two constructed graphs and look at their labels
(bits from $B_1$ and $B_2$ respectively).

• if the first label is 0, we say that $x \notin A$;

• if the first label is 1 and the second label is 0, then we say that $x \notin A$;

• if both labels are 1, then we say that $x \in A$.

If $x \notin A \cup W$, then by the definition of $W$ the procedure above returns the
correct answer with probability $\ge 1 - \varepsilon$. If $x \in A$, then by construction
both labels are equal to 1, and the procedure returns the correct answer with
probability 1. If $x \in W$, then we have no guarantee about the labels in $B_1$; but
from the restricted strong $\varepsilon$-reduction property it follows that the second label
is 0 with probability $\ge 1 - \varepsilon$. Thus, we have a one-sided error with probability
bounded by $\varepsilon$.
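
The two-probe scheme can be sketched in the same spirit. The assumptions are substantial and should be kept in mind: both $Q_1$ and $Q_2$ are replaced here by hash-based stand-in graphs (so the bound $|W| \le |A|/2$ and the effective decoding of the explicit expander are NOT reproduced, and $W$ is found by scanning the whole universe), and all names (`nbr`, `labels`, `ones`, `encode`, `query`) are hypothetical.

```python
import hashlib
import random


def nbr(tag, x, i, size):
    """i-th neighbor of x in the graph named `tag` (hash-based stand-in)."""
    h = hashlib.sha256(f"{tag}:{x}:{i}".encode()).digest()
    return int.from_bytes(h[:8], "big") % size


def labels(tag, A, d, size):
    """Bit string that labels Gamma(A) by 1 and all other vertices by 0."""
    B = [0] * size
    for a in A:
        for i in range(d):
            B[nbr(tag, a, i, size)] = 1
    return B


def ones(tag, B, v, d):
    """Number of 1-labeled neighbors of v in the graph named `tag`."""
    return sum(B[nbr(tag, v, i, len(B))] for i in range(d))


def encode(A, m, d, size1, size2, eps, max_tries=1000):
    A = set(A)
    B1 = labels("Q1", A, d, size1)
    # W: vertices outside A with more than eps * d neighbors labeled 1 in Q1.
    W = [v for v in range(m) if v not in A and ones("Q1", B1, v, d) > eps * d]
    for z in range(max_tries):
        B2 = labels(f"Q2/{z}", A, d, size2)
        # Restricted reduction property: checked only for vertices of W.
        if all(ones(f"Q2/{z}", B2, v, d) <= eps * d for v in W):
            return B1, B2, z, W
    raise RuntimeError("no suitable seed found; try a larger size2 or d")


def query(x, B1, B2, z, d, rng=random):
    """Two bit probes (one in B1, one in B2); True means 'x in A'."""
    first = B1[nbr("Q1", x, rng.randrange(d), len(B1))]
    second = B2[nbr(f"Q2/{z}", x, rng.randrange(d), len(B2))]
    return first == 1 and second == 1
```

Only the vertices of $W$ are examined when a seed $z$ is tested, which is the point of the construction: with a small $W$ each test takes time that depends on $n$ and $\log m$ rather than on $m$.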

5 A lower bound for schemes with cached memory.

In [18] the lower bound $\Omega(n^2 \log m)$ was proven for one-probe schemes with
one-sided errors. This result cannot be applied to schemes with a small "cached"
memory that depends on $A$. In fact, our scheme of size $O(n \log^2 m)$ with a cache of
size $\mathrm{poly}(\log m)$ bits (from Theorem 5) is below this bound for $n \gg \log^{O(1)} m$.

On the other hand, the proof of the lower bound
$\Omega\!\left(\frac{n}{\varepsilon \log(1/\varepsilon)} \log m\right)$ (theorem 2 in [18])
works with minimal changes for schemes with cached data if the
size of the pre-computed and cached information is much less than $n \log m$:

Theorem 10 We consider randomized schemes that store sets of $n$ elements
from a universe of size $m$, with a two-part memory (the cached memory of size
$\mathrm{poly}(\log m)$ and the main storage).

For every constant $\varepsilon < 1$, if $\mathrm{poly}(\log m) \ll n \ll \sqrt{m}$, then any such scheme
with error probability $\varepsilon$ (possibly with two-sided errors) that answers queries
using cached memory of size $\mathrm{poly}(\log m)$ and one bit probe to the main storage
must use space $\Omega\!\left(\frac{n}{\varepsilon \log(1/\varepsilon)} \log m\right)$.

Proof: We follow the arguments from theorem 2 in [18] (preserving the
notation). Consider any randomized scheme with a two-part memory. Denote
by $C$ the cached memory (of size $\mathrm{poly}(\log m)$) and by $B$ the main part of the
memory, of size $s$ (the scheme answers queries with one bit probe to $B$). Our
aim is to prove a lower bound for $s$.

The proof is based on the bound for the size of cover-free families of sets
proven by Dyachkov and Rykov [4]. Recall that a family of sets $F$ is
called $r$-cover-free if $f_0 \not\subseteq f_1 \cup \dots \cup f_r$ for all distinct $f_0, \dots, f_r \in F$.
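
The definition of an $r$-cover-free family can be made concrete with a brute-force check. This is only an illustrative sketch: the function name `is_r_cover_free` is hypothetical, and the running time is exponential in $r$, so it is usable only for toy families.

```python
from itertools import combinations


def is_r_cover_free(family, r):
    """True if no set of the family is contained in the union of r other
    (pairwise distinct) sets of the family."""
    sets = [frozenset(f) for f in family]
    for i, f0 in enumerate(sets):
        others = sets[:i] + sets[i + 1:]
        for group in combinations(others, r):
            if f0 <= frozenset().union(*group):
                return False
    return True
```

For instance, the family $\{1,2\}, \{2,3\}, \{1,3\}$ is 1-cover-free but not 2-cover-free, since $\{1,2\} \subseteq \{2,3\} \cup \{1,3\}$.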
First we take a large enough $\frac{1}{\varepsilon}$-cover-free family $F$ of sets of size $n$ from the
universe $\{1, \dots, m\}$. The construction from [7, theorem 3.1] guarantees that
there exists a family $F$ such that

t^en log -2|-+0(log m) 

By assumption, each set $f \in F$ can be represented in our scheme by some pair
$(B, C)$ (the main storage and the cached memory). Notice that

\F\ = 2 £nlog i^+°( lo s rn ) ^> 2l s l — 2 poly ^ losm ^ 

Hence, there exist some value $C$ and some $F' \subset F$ of size

$$|F'| \ge |F| / 2^{\mathrm{poly}(\log m)} = 2^{\Omega(\varepsilon n \log m)}$$

such that all sets from the family $F'$ share in our scheme the same value $C$ of
the cached memory. Further, we repeat word for word the proof of theorem 2
from [18], substituting $F'$ for $F$.




6 Conclusion. 

In this paper we constructed an effective probabilistic bit-probe scheme with
one-sided error. The space used is close to the trivial information-theoretic
lower bound $\Omega(n \log m)$. The scheme answers queries "$x \in A$?" with a small
one-sided error and requires only polylogarithmic (in the size of the universe)
cached memory and one bit (two bits in the version with effective encoding)
from the main part of the memory.

For reasonable values of parameters (for $n \gg \mathrm{poly}(\log m)$) the size of our
scheme, $O(n \log^2 m)$ with a cache of size $\mathrm{poly}(\log m)$, is below the lower bound
$\Omega(n^2 \log m)$ proven in [18] for one-probe schemes with one-sided errors without
cached data dependent on $A$. The gap between our upper bounds and the trivial
lower bound is a $(\log m)$-factor.

The following questions remain open: How can one construct a bit-probe memory
scheme with one-sided error and effective encoding and decoding that needs
to read only one bit from the main part of the memory to answer a query? What
is the minimal size of the cached memory required for a bit-probe scheme with
one-sided error and space of size $O(n \log m)$?

The author thanks Daniil Musatov for useful discussions, and the anonymous
referees of CSR2011 for deep and very helpful comments.